Value Function Approximation in Reinforcement Learning Using the Fourier Basis
Abstract
We describe the Fourier basis, a linear value function approximation scheme based on the Fourier series. We empirically demonstrate that it performs well compared to radial basis functions and the polynomial basis, the two most popular fixed bases for linear value function approximation, and is competitive with learned proto-value functions.
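For concreteness, here is a minimal sketch of the standard order-n Fourier basis construction, assuming states rescaled to [0, 1]^d; the abstract does not spell the formula out, and the function name and the learning note in the comments are illustrative rather than quoted from the paper.

```python
import numpy as np
from itertools import product

def fourier_features(x, order):
    """Order-n Fourier basis features for a state x scaled to [0, 1]^d.

    Each feature has the form cos(pi * c . x) for an integer coefficient
    vector c in {0, ..., order}^d, giving (order + 1)**d features in total.
    """
    x = np.asarray(x, dtype=float)
    coeffs = np.array(list(product(range(order + 1), repeat=x.size)))
    return np.cos(np.pi * (coeffs @ x))

# Linear value estimate V(x) = w . phi(x); the weights w would be learned
# by an ordinary linear TD method such as Sarsa(lambda).
phi = fourier_features([0.3, 0.7], order=3)  # (3 + 1)**2 = 16 features
```

Since the basis has (n + 1)^d features, it grows quickly with the state dimension; in practice a small fixed order is used and only the linear weights are learned.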
Similar Papers
Basis Function Adaptation in Temporal Difference Reinforcement Learning
We examine methods for on-line optimization of the basis functions for temporal-difference reinforcement learning algorithms. We concentrate on architectures with a linear parameterization of the value function. Our methods optimize the weights of the network while simultaneously adapting the parameters of the basis functions in order to decrease the Bellman approximation error. A gradient-based...
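As an illustration only (the truncated abstract does not give the actual update), one schematic way to adapt basis parameters alongside the linear weights is a semi-gradient step on the squared TD error; the helpers phi and dphi_dtheta below are hypothetical stand-ins for the features and their Jacobian.

```python
import numpy as np

def adaptive_td_step(w, theta, s, r, s_next, gamma,
                     alpha_w, alpha_theta, phi, dphi_dtheta):
    """One schematic update tuning both the linear weights w and the
    basis-function parameters theta to decrease the squared TD error.

    phi(s, theta)         -> feature vector, shape (k,)
    dphi_dtheta(s, theta) -> Jacobian of the features w.r.t. theta, shape (k, p)
    """
    delta = r + gamma * w @ phi(s_next, theta) - w @ phi(s, theta)  # TD error
    w = w + alpha_w * delta * phi(s, theta)           # usual linear TD(0) step
    # Semi-gradient on theta: the bootstrapped target is treated as fixed.
    theta = theta + alpha_theta * delta * (w @ dphi_dtheta(s, theta))
    return w, theta
```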
Adaptive Bases for Reinforcement Learning
We consider the problem of reinforcement learning using function approximation, where the approximating basis can change dynamically while interacting with the environment. A motivation for such an approach is maximizing the fitness of the value function to the problem at hand. Three errors are considered: the squared approximation error, the Bellman residual, and the projected Bellman residual. Algorithms under the...
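For reference, these three criteria are conventionally written as follows for a linear architecture V_w = Φw; the notation here is standard rather than quoted from the paper.

```latex
% Conventional definitions for a linear architecture V_w = \Phi w,
% with T^{\pi} the Bellman operator and \Pi the projection onto span(\Phi).
\[
  \underbrace{\lVert V^{\pi} - \Phi w \rVert^{2}}_{\text{squared approximation error}}
  \qquad
  \underbrace{\lVert T^{\pi}\Phi w - \Phi w \rVert^{2}}_{\text{Bellman residual}}
  \qquad
  \underbrace{\lVert \Pi\, T^{\pi}\Phi w - \Phi w \rVert^{2}}_{\text{projected Bellman residual}}
\]
```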
Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state-action value function, Q. Theory is presented showing that linear function approximation representations of Q can degrade the rate of convergence of performance gradient estimates by a factor of O(ML) relative to when no func...
Shaping Proto-Value Functions Using Rewards
In reinforcement learning (RL), an important sub-problem is learning the value function, which is chiefly influenced by the architecture used to represent value functions. Often, the value function is expressed as a linear combination of a pre-selected set of basis functions. These basis functions are either selected in an ad-hoc manner or are tailored to the RL task using domain knowledge...
Basis function construction for hierarchical reinforcement learning
Much past work on solving Markov decision processes (MDPs) using reinforcement learning (RL) has relied on combining parameter estimation methods with hand-designed function approximation architectures for representing value functions. Recently, there has been growing interest in a broader framework that combines representation discovery and control learning, where value functions are approxima...